In TensorFlow, a Variable is a special kind of Tensor object that stores model parameters and allows them to be updated during training.

tf.Variable

float32, int32, bool, string …

import tensorflow as tf

a=tf.Variable(initial_value=3.14, name='var_a')

print(a)

<tf.Variable 'var_a:0' shape=() dtype=float32, numpy=3.14>

b=tf.Variable(initial_value=[1, 2, 3], name='var_b')

print(b)

<tf.Variable 'var_b:0' shape=(3,) dtype=int32, numpy=array([1, 2, 3], dtype=int32)>

c=tf.Variable(initial_value=[True, False], dtype=tf.bool)

print(c)

<tf.Variable 'Variable:0' shape=(2,) dtype=bool, numpy=array([ True, False])>

d=tf.Variable(initial_value=['abc'], dtype=tf.string)

print(d)

<tf.Variable 'Variable:0' shape=(1,) dtype=string, numpy=array([b'abc'], dtype=object)>
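As a small supplementary sketch (the variable names `inferred` and `explicit` are made up for illustration), the dtype of a Variable is inferred from its initial value unless one is passed explicitly, and the dtype, shape, and current value can be inspected directly:

```python
import tensorflow as tf

# dtype is inferred from the initial value unless it is given explicitly.
inferred = tf.Variable([1, 2, 3])                     # Python ints -> int32
explicit = tf.Variable([1, 2, 3], dtype=tf.float32)   # forced to float32

print(inferred.dtype.name)   # int32
print(explicit.dtype.name)   # float32
print(explicit.shape)        # (3,)
print(explicit.numpy())      # current value as a NumPy array
```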

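An initial value must always be specified when creating a Variable.

Every Variable has a trainable attribute, which defaults to True.

High-level APIs such as Keras use the trainable attribute to manage which variables are trained and which are not.

A non-trainable variable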

w=tf.Variable([1, 2, 3], trainable=False)

print(w.trainable)

False
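One place where the trainable flag has a visible effect is tf.GradientTape, which automatically watches trainable variables only. The sketch below uses two made-up variables, `w_train` and `w_frozen`, and is meant as an illustration rather than a training recipe:

```python
import tensorflow as tf

w_train = tf.Variable(2.0, trainable=True)
w_frozen = tf.Variable(3.0, trainable=False)

# Only trainable variables are watched automatically by the tape.
with tf.GradientTape() as tape:
    y = w_train * w_frozen

grads = tape.gradient(y, [w_train, w_frozen])
print(grads[0])  # dy/dw_train, which equals the value of w_frozen
print(grads[1])  # None, because w_frozen was not watched
```

A non-trainable variable can still be differentiated against by calling tape.watch() on it explicitly.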

Modifying a Variable: .assign() & .assign_add()

print(w.assign([3, 1, 4], read_value=True))

<tf.Variable 'UnreadVariable' shape=(3,) dtype=int32, numpy=array([3, 1, 4], dtype=int32)>

The variable's value changes from [1, 2, 3] to [3, 1, 4].

w.assign_add([2, -1, 2], read_value=True)

<tf.Variable 'UnreadVariable' shape=(3,) dtype=int32, numpy=array([5, 0, 6], dtype=int32)>

print(w.value())

tf.Tensor([5 0 6], shape=(3,), dtype=int32)

With read_value=True (the default), the operation updates the Variable's current value and then returns the updated value.

Calling w.value() returns the current value as a Tensor.
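For contrast, here is a short sketch of read_value=False together with .assign_sub(), the subtracting counterpart of .assign_add(). The variable `v` is made up for illustration, and the None return value applies to eager execution:

```python
import tensorflow as tf

v = tf.Variable([1, 2, 3])

# With read_value=False the update still happens, but nothing is read back.
result = v.assign([3, 1, 4], read_value=False)
print(result)       # None when executing eagerly

# assign_sub subtracts element-wise, in place.
v.assign_sub([1, 1, 1])
print(v.numpy())    # [2 0 3]
```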

An assignment cannot change the shape or dtype of a Variable.
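A quick way to see this restriction is to attempt both kinds of invalid assignment and catch the resulting exceptions. This is only a sketch: the exact exception type and message may differ between TensorFlow versions, so both ValueError and TypeError are caught here.

```python
import tensorflow as tf

w = tf.Variable([1, 2, 3])  # shape (3,), dtype int32

errors = []
# First value has the wrong shape, second has an incompatible dtype.
for bad_value in ([1, 2], [1.5, 2.5, 3.5]):
    try:
        w.assign(bad_value)
    except (ValueError, TypeError) as err:
        errors.append(type(err).__name__)

print(errors)       # one exception per rejected assignment
print(w.numpy())    # the variable keeps its original value
```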

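The parameters of a neural network must be initialized with random weights in order to break symmetry during backpropagation.

Random initialization schemes can be used when creating a TensorFlow Variable (see tf.random).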
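As a minimal sketch of this, tf.random can supply the initial value directly. The shapes, the standard deviation, and the uniform range below are arbitrary choices for illustration, not recommended settings:

```python
import tensorflow as tf

tf.random.set_seed(1)

# Draw the initial values from a normal and from a uniform distribution.
w_normal = tf.Variable(tf.random.normal(shape=(2, 3), mean=0.0, stddev=0.1))
w_uniform = tf.Variable(tf.random.uniform(shape=(2, 3), minval=-0.1, maxval=0.1))

print(w_normal.shape, w_normal.dtype.name)
print(w_uniform.shape, w_uniform.dtype.name)
```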

Glorot initialization of a Variable

tf.random.set_seed(1)

init=tf.keras.initializers.GlorotNormal()

tf.print(init(shape=(3,)))

[-0.722795904 1.01456821 0.251808226]

v=tf.Variable(init(shape=(2, 3)))

tf.print(v)

[[0.28982234 -0.782292783 -0.0453658961]

[0.960991383 -0.120003454 0.708528221]]

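Xavier (or Glorot) initialization

Initializing weights from a naive uniform or normal distribution, without accounting for layer size, can lead to poor performance when training the model.

The idea behind Xavier initialization is to roughly balance the variance of the gradients across the different layers.

Without this balance, some layers may receive too much attention during training while other layers lag behind in learning.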
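To make the scaling concrete: for a weight matrix with fan_in inputs and fan_out outputs, GlorotNormal draws from a truncated normal with standard deviation sqrt(2 / (fan_in + fan_out)), and GlorotUniform draws from a uniform range with limit sqrt(6 / (fan_in + fan_out)). The sketch below checks this empirically on a matrix whose size (300 by 200) is an arbitrary choice for illustration:

```python
import math
import tensorflow as tf

fan_in, fan_out = 300, 200

# Glorot normal: target standard deviation is sqrt(2 / (fan_in + fan_out)).
w_normal = tf.keras.initializers.GlorotNormal(seed=1)(shape=(fan_in, fan_out))
expected_std = math.sqrt(2.0 / (fan_in + fan_out))
empirical_std = float(tf.math.reduce_std(w_normal))
print(expected_std, empirical_std)   # the two values should be close

# Glorot uniform: samples lie within +/- sqrt(6 / (fan_in + fan_out)).
w_uniform = tf.keras.initializers.GlorotUniform(seed=1)(shape=(fan_in, fan_out))
limit = math.sqrt(6.0 / (fan_in + fan_out))
max_abs = float(tf.reduce_max(tf.abs(w_uniform)))
print(limit, max_abs)                # max_abs should not exceed limit
```

Because the scale shrinks as the layer gets wider, the variance of activations and gradients stays in a similar range from layer to layer.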

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/initializers

class MyModule(tf.Module):
    def __init__(self):
        super().__init__()
        init=tf.keras.initializers.GlorotNormal()
        # w1 is trainable; w2 is excluded from training
        self.w1=tf.Variable(init(shape=(2, 3)), trainable=True)
        self.w2=tf.Variable(init(shape=(1, 2)), trainable=False)

m=MyModule()

print('All variables:', [v.shape for v in m.variables])

print('Trainable variables:', [v.shape for v in m.trainable_variables])

All variables: [TensorShape([2, 3]), TensorShape([1, 2])]

Trainable variables: [TensorShape([2, 3])]

Subclassing tf.Module makes every variable stored in the object directly accessible through its .variables attribute.
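The collection is recursive: variables that belong to a submodule stored as an attribute are gathered as well. A small sketch with two hypothetical classes, `Inner` and `Outer`:

```python
import tensorflow as tf

class Inner(tf.Module):
    def __init__(self):
        super().__init__()
        self.b = tf.Variable(tf.zeros((3,)))

class Outer(tf.Module):
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.ones((2, 3)))
        self.inner = Inner()  # submodule held as an attribute

outer = Outer()
# .variables includes the variables of nested submodules.
shapes = sorted(tuple(v.shape) for v in outer.variables)
print(shapes)  # [(2, 3), (3,)]
```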

In tf.keras, every layer inherits from the tf.Module class.
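As an illustration, a Keras layer exposes the same kind of variable bookkeeping. The sketch below builds a Dense layer with arbitrarily chosen sizes and then freezes it; the exact behavior is assumed to match recent tf.keras versions:

```python
import tensorflow as tf

layer = tf.keras.layers.Dense(units=4)
layer.build((None, 3))  # creates a kernel of shape (3, 4) and a bias of shape (4,)

shapes_before = sorted(tuple(v.shape) for v in layer.trainable_variables)
print(shapes_before)    # [(3, 4), (4,)]

# Freezing the layer moves its variables out of trainable_variables.
layer.trainable = False
print(len(layer.trainable_variables))      # 0
print(len(layer.non_trainable_variables))  # 2
```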

If a Variable is defined inside a function that is not decorated with tf.function, a new Variable object is created every time the function is called.
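A tiny sketch of this behavior, using a hypothetical helper named `make_var`:

```python
import tensorflow as tf

def make_var():
    # Plain Python function: runs eagerly, so each call builds a new object.
    return tf.Variable([1, 2, 3])

first = make_var()
second = make_var()
print(first is second)  # False: two independent Variable objects

first.assign([9, 9, 9])
print(second.numpy())   # [1 2 3], unaffected by the change to `first`
```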

tf.function, in contrast, traces the function and builds a graph, and that graph reuses the same Variable objects.
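Tracing can be observed through a Python side effect, which runs only while the graph is being built and not on later calls with the same input signature. The names `trace_log` and `square` are made up for this sketch:

```python
import tensorflow as tf

trace_log = []

@tf.function
def square(x):
    trace_log.append("traced")  # Python side effect: executes only during tracing
    return x * x

# Both calls use a float32 scalar tensor, so the traced graph is reused.
r1 = square(tf.constant(2.0))
r2 = square(tf.constant(3.0))

print(float(r1), float(r2))  # 4.0 9.0
print(len(trace_log))        # 1
```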

For this reason, a decorated function that creates a new Variable each time it runs raises an error.

@tf.function
def f(x):
    w=tf.Variable([1, 2, 3])

f([1])

ValueError: tf.function-decorated function tried to create variables on non-first call.

Instead, define the Variable outside the decorated function and use it inside the function.

w=tf.Variable(tf.random.uniform((3, 3)))

@tf.function
def compute_z(x):
    return tf.matmul(w, x)

x=tf.constant([[1], [2], [3]], dtype=tf.float32)

tf.print(compute_z(x))

[[2.93155909]

[2.01902]

[3.25393867]]
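A related pattern, sketched here with a hypothetical `Counter` class, keeps the Variable as state on an object and creates it only on the first call. The error message above refers to creation on a non-first call, so creation guarded this way is allowed:

```python
import tensorflow as tf

class Counter(tf.Module):
    def __init__(self):
        super().__init__()
        self.count = None

    @tf.function
    def __call__(self):
        # The Variable is created only once, the first time the function runs.
        if self.count is None:
            self.count = tf.Variable(0)
        return self.count.assign_add(1)

counter = Counter()
first_result = counter()
second_result = counter()
print(int(first_result), int(second_result))  # 1 2
```

Both calls update the same Variable, so the state persists between invocations.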

TensorFlow Variables